Add database persistence benchmarks #915

Open. benthecarman wants to merge 8 commits into lightningdevkit:main from benthecarman:db-bench

Conversation

benthecarman commented May 31, 2026
Contributor

Closes #908

Benchmark realistic payment and pending-payment persistence workloads
across filesystem, SQLite, and optional PostgreSQL stores. Use the async
KV-store APIs so the measured paths match the database interfaces used by
async persistence.

Also adds support for different database backends to the existing payments benchmark.

Benchmark Results

Results on my computer with a Ryzen 7 5800XT 8-Core, 16-Thread CPU

Operations

| Bench | filesystem | sqlite | postgres |
| --- | ---: | ---: | ---: |
| channel_open | 4.99 ms | 5.38 ms | 14.74 ms |
| forwarding | 5.13 s | 5.16 s | 5.70 s |
| channel_open_close | 5.07 s | 5.20 s | 5.08 s |
| payments | 5.69 s | 6.26 s | 11.73 s |

Startup

| Scenario | filesystem | sqlite | postgres |
| --- | ---: | ---: | ---: |
| channels_1_payments_2 | 10.37 ms | 12.04 ms | 57.57 ms |
| channels_10_payments_2 | 12.02 ms | 16.68 ms | 56.77 ms |
| channels_100_payments_2 | 18.37 ms | 50.61 ms | 62.03 ms |
| channels_100_payments_1000 | 17.77 ms | 70.26 ms | 61.86 ms |

Database Hot Path

| Bench | filesystem | sqlite | postgres |
| --- | ---: | ---: | ---: |
| channel_open_like | 0.26 ms | 0.46 ms | 7.54 ms |
| forwarding_25_like | 5.00 ms | 10.56 ms | 124.89 ms |

ldk-reviews-bot commented May 31, 2026

🎉 This PR is now ready for review!
Please choose at least one reviewer by assigning them on the right bar.
If no reviewers are assigned within 10 minutes, I'll automatically assign one.
Once the first reviewer has submitted a review, a second will be assigned if required.

benthecarman force-pushed the db-bench branch 3 times, most recently from 5f75dd9 to bc97c3a, on June 2, 2026 22:19

tnull commented Jun 3, 2026
Collaborator

Feel free to rebase on top of #919 and let me know if you see any differences in the benchmarks.

benthecarman commented Jun 3, 2026
Contributor Author

codex summary:

Benchmark Results

Payment Benchmarks

| Backend | Previous | Current | Change |
| --- | ---: | ---: | --- |
| filesystem | 6.2475 s | 6.2138 s | slightly faster |
| sqlite | 6.7378 s | 6.6388 s | faster |
| postgres | 17.669 s | 17.489 s | slightly faster; Criterion reported no significant change |

Lower is better. Delta compares the async-branch rerun against the previous saved Criterion results.

| Bench | Backend | Previous | Async rerun | Delta |
| --- | --- | ---: | ---: | ---: |
| single write new | filesystem | 1.844 ms | 1.847 ms | +0.2% |
| single write new | sqlite | 2.673 ms | 2.658 ms | -0.6% |
| single write new | postgres | 636.0 us | 630.1 us | -0.9% |
| single write existing | filesystem | 1.860 ms | 1.881 ms | +1.1% |
| single write existing | sqlite | 22.1 us | 22.5 us | +2.0% |
| single write existing | postgres | 638.1 us | 624.9 us | -2.1% |
| single read | filesystem | 14.7 us | 15.1 us | +2.2% |
| single read | sqlite | 17.0 us | 17.3 us | +1.7% |
| single read | postgres | 145.0 us | 146.2 us | +0.8% |
| single remove | filesystem | 908.8 us | 920.0 us | +1.2% |
| single remove | sqlite | 2.842 ms | 2.728 ms | -4.0% |
| single remove | postgres | 1.445 ms | 1.470 ms | +1.8% |
| warm insert 100 | filesystem | 185.626 ms | 185.919 ms | +0.2% |
| warm insert 100 | sqlite | 276.593 ms | 269.530 ms | -2.6% |
| warm insert 100 | postgres | 63.417 ms | 62.791 ms | -1.0% |
| concurrent distinct 100 | filesystem | 10.964 ms | 10.909 ms | -0.5% |
| concurrent distinct 100 | sqlite | 272.638 ms | 261.112 ms | -4.2% |
| concurrent distinct 100 | postgres | 9.177 ms | 9.230 ms | +0.6% |
| concurrent same-key 100 | filesystem | 11.495 ms | 13.050 ms | +13.5% |
| concurrent same-key 100 | sqlite | 249.663 ms | 191.029 ms | -23.5% |
| concurrent same-key 100 | postgres | 58.739 ms | 56.400 ms | -4.0% |
| cold insert 100 | filesystem | 185.829 ms | 183.843 ms | -1.1% |
| cold insert 100 | sqlite | 266.497 ms | 307.448 ms | +15.4% |
| cold insert 100 | postgres | 74.440 ms | 72.031 ms | -3.2% |
| cold update 100 | filesystem | 188.561 ms | 186.808 ms | -0.9% |
| cold update 100 | sqlite | 247.062 ms | 246.978 ms | -0.0% |
| cold update 100 | postgres | 78.208 ms | 74.185 ms | -5.1% |
| reload 100 | filesystem | 2.995 ms | 3.002 ms | +0.2% |
| reload 100 | sqlite | 3.532 ms | 3.499 ms | -0.9% |
| reload 100 | postgres | 3.237 ms | 3.152 ms | -2.7% |
| first page from 10k | filesystem | 25.290 ms | 25.090 ms | -0.8% |
| first page from 10k | sqlite | 29.8 us | 24.8 us | -17.0% |
| first page from 10k | postgres | 184.5 us | 182.3 us | -1.2% |
| second page from 10k | filesystem | 50.665 ms | 50.422 ms | -0.5% |
| second page from 10k | sqlite | 66.5 us | 50.8 us | -23.7% |
| second page from 10k | postgres | 376.7 us | 372.7 us | -1.1% |
| lifecycle insert-update-read | filesystem | 6.786 ms | 3.736 ms | -44.9% |
| lifecycle insert-update-read | sqlite | 9.866 ms | 5.216 ms | -47.1% |
| lifecycle insert-update-read | postgres | 2.533 ms | 1.617 ms | -36.2% |
| pending insert 100 | filesystem | 273.515 ms | 183.380 ms | -33.0% |
| pending insert 100 | sqlite | 333.859 ms | 262.138 ms | -21.5% |
| pending insert 100 | postgres | 75.244 ms | 75.999 ms | +1.0% |
| pending update 100 | filesystem | 197.544 ms | 187.656 ms | -5.0% |
| pending update 100 | sqlite | 257.055 ms | 264.137 ms | +2.8% |
| pending update 100 | postgres | 75.756 ms | 63.715 ms | -15.9% |
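
For anyone re-deriving the Delta column: it appears to be the plain relative change between the two timings, with negative values meaning the rerun was faster. A minimal sketch, assuming both timings are in the same unit; the function names are made up for illustration and are not part of the PR.

```rust
/// Relative change from `previous` to `current`, in percent.
/// Negative means the rerun was faster (lower is better).
fn percent_delta(previous: f64, current: f64) -> f64 {
    (current - previous) / previous * 100.0
}

/// Format a delta with an explicit sign and one decimal place,
/// matching the style used in the table (for example "-44.9%").
fn format_delta(previous: f64, current: f64) -> String {
    format!("{:+.1}%", percent_delta(previous, current))
}
```

With the filesystem lifecycle row, `format_delta(6.786, 3.736)` gives `-44.9%`. Rows whose printed timings are heavily rounded (the microsecond ones) may differ by a tenth or two when recomputed from the table, since the original deltas were presumably computed from unrounded values.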

Takeaways

  • The async branch materially improves the lifecycle workload across all stores: filesystem improved by 44.9%, sqlite by 47.1%, and Postgres by 36.2%.
  • Postgres pending-payment updates improved meaningfully, dropping from 75.756 ms to 63.715 ms, a 15.9% reduction.
  • Most single-operation Postgres results are roughly unchanged, with small improvements on writes and small regressions/noise on reads/removes.
  • Postgres cold payment-store writes improved modestly: inserts improved by 3.2% and updates by 5.1%.
  • SQLite saw large improvements in paginated reads and same-key concurrent writes, but cold insert regressed by 15.4%, so that path is worth a closer look.
  • Payment benchmarks are still not trustworthy for comparison because the workload does not complete cleanly and times out waiting for HTLC slots.

@joostjager

Contributor

It would have been nice if async persistence would show a clear performance win here. The store-level smoke tests are useful, but the closest thing to real node load seems to be the payment benchmark, and that does not show a meaningful improvement, putting aside that it is marked as not reliable yet.

Does this tell us much about high-load node performance in practice? If not, what would be the next useful benchmark?

tnull commented Jun 4, 2026
Collaborator

> It would have been nice if async persistence would show a clear performance win here. The store-level smoke tests are useful, but the closest thing to real node load seems to be the payment benchmark, and that does not show a meaningful improvement, putting aside that it is marked as not reliable yet.

Hmm, note that 'async persistence' means pre and post #919 AFAIU? I.e., it does not refer to LDK's async persistence itself, which we switched to pre-#919.

@benthecarman

Contributor Author

> Hmm, note that 'async persistence' means pre and post #919 AFAIU? I.e., it does not refer to LDK's async persistence itself, which we switched to pre-#919.

Yeah, that is pre/post #919.

@benthecarman

Contributor Author

> Does this tell us much about high-load node performance in practice? If not, what would be the next useful benchmark?

Added a few more covering startup time, forwarding payments, and channel opens. I'm still working on making these more robust.

benthecarman force-pushed the db-bench branch 2 times, most recently from 6bd3945 to fedeaf4, on June 5, 2026 23:51
AI-Assisted-By: OpenAI Codex
Benchmark realistic payment and pending-payment persistence workloads across
filesystem, SQLite, and optional PostgreSQL stores. Use the async KV-store
APIs so the measured paths match the database interfaces used by async
persistence.

AI-Assisted: Codex
Run the existing payments benchmark once per configured store backend so
filesystem, SQLite, and optional PostgreSQL results are reported under the
same payment flow.

AI-assisted-by: OpenAI Codex
Wait for the cleanup payment to settle before starting the next payment
benchmark sample. This keeps the measured forward-payment duration intact
while avoiding HTLC state leaking into later samples.

AI tools: Created with assistance from OpenAI Codex.
Add an operations bench target with a forwarding benchmark that compares
sqlite, filesystem, and postgres stores over a settled multi-hop payment.

AI-assisted-by: OpenAI Codex
Add a channel-open benchmark that measures the open_channel call while
leaving chain confirmation cleanup outside the timed section.

AI-assisted-by: OpenAI Codex
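
The "time the call, clean up outside the timed section" pattern described in the commits above can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's code: `time_samples` and its parameters are hypothetical names, and the real benchmarks run under Criterion, whose `iter_custom` is one way to get similar control over what is measured.

```rust
use std::time::{Duration, Instant};

/// Run `samples` iterations, timing only `measured`.
/// `cleanup` runs after every sample but outside the timed section,
/// so leftover state does not leak into the next sample.
fn time_samples<S, M, C>(
    samples: usize,
    state: &mut S,
    mut measured: M,
    mut cleanup: C,
) -> Vec<Duration>
where
    M: FnMut(&mut S),
    C: FnMut(&mut S),
{
    let mut durations = Vec::with_capacity(samples);
    for _ in 0..samples {
        let start = Instant::now();
        measured(state);
        durations.push(start.elapsed());
        // Untimed: restore a clean starting point for the next sample.
        cleanup(state);
    }
    durations
}
```

In the payment case, `measured` would stand in for the forward payment and `cleanup` for waiting until the cleanup payment has settled; in the channel-open case, `cleanup` would stand in for the chain confirmation work.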
Add migratable key listing for SQLite and PostgreSQL stores so benchmark
fixtures can copy persisted node state between database backends.

AI-assisted-by: OpenAI Codex
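
To illustrate why key listing matters for fixtures: once a store can enumerate its keys, copying persisted state to another backend is a list-read-write loop. The sketch below uses a hypothetical, synchronous, flat-key `KvStore` trait and an in-memory store purely for illustration; the store interface used in the PR is async and is not reproduced here.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal key-value interface (an assumption for this sketch,
/// not the store trait used by the PR).
trait KvStore {
    fn list_keys(&self) -> Vec<String>;
    fn read(&self, key: &str) -> Option<Vec<u8>>;
    fn write(&mut self, key: &str, value: Vec<u8>);
}

/// In-memory stand-in for a filesystem, SQLite, or PostgreSQL backend.
#[derive(Default)]
struct MemoryStore {
    entries: BTreeMap<String, Vec<u8>>,
}

impl KvStore for MemoryStore {
    fn list_keys(&self) -> Vec<String> {
        self.entries.keys().cloned().collect()
    }
    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.entries.get(key).cloned()
    }
    fn write(&mut self, key: &str, value: Vec<u8>) {
        self.entries.insert(key.to_string(), value);
    }
}

/// Copy every listed key from `source` into `target`.
/// Returns the number of entries copied.
fn copy_all(source: &dyn KvStore, target: &mut dyn KvStore) -> usize {
    let mut copied = 0;
    for key in source.list_keys() {
        if let Some(value) = source.read(&key) {
            target.write(&key, value);
            copied += 1;
        }
    }
    copied
}
```

A fixture could populate one backend once and then call something like `copy_all` for each other backend, so every store starts from the same persisted node state.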
Add a startup benchmark that restarts a node whose store already contains
channel and payment data, so startup cost reflects persisted node state.

AI-assisted-by: OpenAI Codex
benthecarman requested a review from tnull on June 9, 2026 15:25
benthecarman marked this pull request as ready for review on June 9, 2026 15:25

benthecarman commented Jun 9, 2026
Contributor Author

Since #919 is merged, marking this ready for review.

Hopefully we get CI re-enabled, because I am not sure how much these changes affect how long the bench CI job takes. The whole suite takes a while, so I tried trimming it down for CI. That subset takes a few minutes on my computer, but I am unsure how long it takes on the CI machines.

Development

Successfully merging this pull request may close these issues.

Benchmark database options

4 participants